Personality Estimation from Japanese Text

Authors

  • Koichi Kamijo
  • Tetsuya Nasukawa
  • Hideya Kitamura

Abstract

We created a model that estimates personality traits from text that authors wrote in Japanese, and we measured its performance by conducting surveys and analyzing the Twitter data of 1,630 users. We used the Big Five personality traits as the targets of the estimation. Our approach combines category-based and Word2Vec-based approaches. For the category-based element, we added several categories unique to Japanese alongside the ones regularly used in the English model, and for the Word2Vec-based element, we used a model called GloVe. We found that some of the newly added categories correlate more strongly with personality traits than other categories do, and that combining the category-based and Word2Vec-based approaches improves the accuracy of personality trait estimation compared with using either one alone.
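The combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the category lexicon, the two-dimensional word vectors, and all function names below are hypothetical, and the text is assumed to be already tokenized (Japanese would normally need a morphological analyzer first). The sketch only shows how per-category token rates and an averaged word vector could be concatenated into one feature vector.

```python
from typing import Dict, List, Set

# Hypothetical category lexicon (illustrative only, not the paper's categories).
CATEGORIES: Dict[str, Set[str]] = {
    "positive_emotion": {"嬉しい", "楽しい"},
    "first_person": {"私", "僕"},
}

# Hypothetical 2-dimensional vectors standing in for pretrained embeddings.
EMBEDDINGS: Dict[str, List[float]] = {
    "私": [1.0, 0.0],
    "楽しい": [0.0, 1.0],
}
EMBEDDING_DIM = 2


def category_features(tokens: List[str]) -> List[float]:
    """Fraction of tokens that fall into each category, in lexicon order."""
    if not tokens:
        return [0.0] * len(CATEGORIES)
    total = len(tokens)
    return [sum(t in words for t in tokens) / total
            for words in CATEGORIES.values()]


def embedding_features(tokens: List[str]) -> List[float]:
    """Element-wise mean of the vectors of all tokens that have an embedding."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return [0.0] * EMBEDDING_DIM
    return [sum(v[i] for v in vectors) / len(vectors)
            for i in range(EMBEDDING_DIM)]


def combined_features(tokens: List[str]) -> List[float]:
    """Concatenate category-based and embedding-based features."""
    return category_features(tokens) + embedding_features(tokens)


if __name__ == "__main__":
    # Pre-tokenized example sentence: "私 は 楽しい です"
    print(combined_features(["私", "は", "楽しい", "です"]))
```

A vector like this could then be passed to any regressor, one per Big Five trait. The abstract does not say which learner was used, so that step is left out here.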

Similar resources

Estimation of Polychlorinated Biphenyls Intake through Fish Oil-Derived Dietary Supplements and Prescription Drugs in the Japanese Population

Background: Oily fish and their extracted oils may be a source of polychlorinated biphenyls (PCBs) which can induce toxic effects on the consumers. The main aim of this survey was estimation of PCBs intake through fish oil-derived dietary supplements and prescription drugs in the Japanese population. Methods: PCBs levels were determined in 20 fish oil-derived dietary supplements and 6 oil-deri...

A Self-Organizing Japanese Word Segmenter using Heuristic Word Identification and Re-estimation

We present a self-organized method to build a stochastic Japanese word segmenter from a small number of basic words and a large amount of unsegmented training text. It consists of a word-based statistical language model, an initial estimation procedure, and a re-estimation procedure. Initial word frequencies are estimated by counting all possible longest match strings between the training text ...

Accent Sandhi Estimation of Tokyo Dialect of Japanese Using Conditional Random Fields

When synthesizing speech from Japanese text, correct assignment of accent nuclei for input text with arbitrary contents is indispensable in obtaining naturally-sounding synthetic speech. A phenomenon called accent sandhi occurs in utterances of Japanese; when a word is uttered in a sentence, its accent nucleus may change depending on the contexts of preceding/succeeding words. This paper descri...

No mobile, no life: Self-perception and text-message dependency among Japanese high school students

A survey was conducted to investigate how self-perception of text-message dependency leads to psychological/behavioral symptoms in relation to personality factors. Japanese high school students completed a self-report questionnaire measuring frequency of text-messages, self-perception of text-message dependency, psychological/behavioral symptoms, extroversion and neuroticism. Self-perception of ...

Japanese Sentence Order Estimation using Supervised Machine Learning with Rich Linguistic Clues

Estimation of sentence order (sometimes referred to as sentence ordering) is one of the problems that arise in sentence generation and sentence correction. When generating a text that consists of multiple sentences, it is necessary to arrange the sentences in an appropriate order so that the text can be understood easily. In this study, we proposed a new method using supervised machine learning...

Publication date: 2016